Analysis of Pirate Attacks Over the Past 30 Years

Christopher Zabriski

Overview

In 2013 alone, the world economy lost an estimated 18 billion dollars to Somali pirate attacks (CNN Business). Because pirate attacks have such a large impact on global trade, it is important to understand how to prevent them in the future. In this report, I will walk step by step through how to collect, process, visualize, analyze, and draw conclusions from data, using the information that has been recorded on pirate attacks over the last 30 years.

1) Data Collection

The first step in the data life cycle is data collection. In this instance, our data is stored in a CSV file, so we will use pandas to extract and manipulate it.
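As a minimal sketch of this step, the snippet below reads attack records into a pandas dataframe. The column names (date, nearest_country, vessel_type) and the three sample rows are assumptions made for illustration, not the actual dataset; with the real file you would pass its path to load_attacks() instead of the in-memory sample.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real CSV file. The column names
# and rows are assumptions for illustration only.
SAMPLE_CSV = """date,nearest_country,vessel_type
2009-04-08,SOM,Container Ship
1995-06-12,IDN,Tanker
2003-11-30,NGA,Bulk Carrier
"""


def load_attacks(source):
    """Read pirate attack records from a file path or file-like object."""
    return pd.read_csv(source)


attacks = load_attacks(io.StringIO(SAMPLE_CSV))
print(attacks.head())
```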

2) Data Parsing and Management

The second step in the data life cycle involves processing raw data into usable information. In this section we will remove the columns of the dataframe that we will not be using, break the date of each attack into separate year and month values, and sort the dataframe by year from oldest to most recent. Next, we will group the different types of vessels that get attacked into a smaller set of categories that we will use to find trends later. Finally, we will condense each pirate attack's location from nearest country to continent.
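The cleaning steps described above might look roughly like the following. This is a sketch under stated assumptions: the column names, the unused data_source column, the keyword-based vessel categories, and the small country-to-continent table are all hypothetical stand-ins for whatever the real dataset requires.

```python
import pandas as pd

# Hypothetical raw records; column names and values are assumptions.
raw = pd.DataFrame({
    "date": ["2009-04-08", "1995-06-12", "2003-11-30"],
    "nearest_country": ["SOM", "IDN", "NGA"],
    "vessel_type": ["Container Ship", "Chemical Tanker", "Fishing Vessel"],
    "data_source": ["a", "b", "c"],  # example of a column we do not need
})

# Assumed keyword -> category mapping; the first matching keyword wins.
VESSEL_KEYWORDS = {
    "tanker": "tanker",
    "carrier": "carrier",
    "container": "container",
    "cargo": "cargo",
    "tug": "tug",
    "supply": "supply",
}

# Assumed country-code -> continent lookup (only the sample countries).
CONTINENTS = {"SOM": "Africa", "NGA": "Africa", "IDN": "Asia"}


def categorize_vessel(name):
    """Map a free-text vessel type onto one of a few broad categories."""
    lowered = str(name).lower()
    for keyword, category in VESSEL_KEYWORDS.items():
        if keyword in lowered:
            return category
    return "other"


def clean(df):
    """Drop unused columns, split the date, categorize, and sort by year."""
    df = df.drop(columns=["data_source"])
    dates = pd.to_datetime(df["date"])
    df["year"] = dates.dt.year
    df["month"] = dates.dt.month
    df["vessel_category"] = df["vessel_type"].map(categorize_vessel)
    df["continent"] = df["nearest_country"].map(CONTINENTS)
    return df.sort_values("year").reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)
```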

3) Data Visualization

The third step in the data life cycle is data visualization. In this step we will use several graphs and maps to see how pirate attacks have changed over time in terms of the total number of attacks, attacks on each vessel type, and attacks at a given location.

Here we can see that pirate attacks spiked in 2000, 2003, and 2009-2011; the graph forms a rough M shape, indicating large fluctuations in the number of attacks. Looking more closely at the shape of the graph, we can start to see that the number of pirate attacks per year may follow some form of polynomial trend.
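The series behind a graph like this one is simply the number of recorded attacks in each year. The sketch below shows one way to compute it and to pick out spike years, here defined (as an assumption) as years with more attacks than both neighboring years. The list of years is made-up sample data, not the real records.

```python
import pandas as pd

# Hypothetical "year" column: one entry per recorded attack.
years = pd.Series([1999, 2000, 2000, 2000, 2001, 2003, 2003, 2004], name="year")

# Count attacks per year, then fill in years that had no recorded attacks.
counts = years.value_counts().sort_index()
all_years = range(int(years.min()), int(years.max()) + 1)
attacks_per_year = counts.reindex(all_years, fill_value=0)


def peak_years(series):
    """Return the years whose count is higher than both neighboring years."""
    values = series.tolist()
    index = series.index.tolist()
    return [
        index[i]
        for i in range(1, len(values) - 1)
        if values[i] > values[i - 1] and values[i] > values[i + 1]
    ]


print(attacks_per_year)
print(peak_years(attacks_per_year))
```

The resulting attacks_per_year series can be passed directly to a plotting library to draw the line graph.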

From the information we have, we can see that the majority of pirates focus their attacks on larger vessels such as tankers, carriers, container ships, and cargo ships. This suggests that these pirate bands have enough members to take over such large ships, which in turn implies a degree of organization and a leadership hierarchy capable of coordinating attacks of this size. While some attacks do occur on smaller vessels, the bulk of the problem is defending ships that carry large amounts of cargo.

Here we can observe that tankers, while being the type of vessel most frequently attacked by pirate organizations, have seen a slow decrease in attacks over time. However, we cannot observe a clear relationship between time and attacks on the other types of ships, as their fluctuations do not appear to follow any trend.

From this map we can see that from 1995-1999 the majority of recorded pirate attacks occurred in Southeast Asia, with smaller clusters around India and along the east and west coasts of Africa.

In the second map we can see a dramatic increase in pirate attacks not only in Southeast Asia, but also along the African coasts and Northern regions of South America. This is also the first, and only, time we see pirate attacks on the United States.

In the third map we see that pirate attacks in Southeast Asia have been cut in half, while attacks in the Arabian Sea have increased dramatically.

The fourth map confirms the trend that the majority of pirate attacks are located in Southeast Asia, the Arabian Sea, and along the coasts of Africa, with a few attacks in the Caribbean and the northern parts of South America.

In the fifth period the types of ships that were attacked start to be recorded. In this map we label the different categories of ships as follows: tanker=blue, carrier=orange, container=green, cargo=red, other=purple, tug=beige, supply=pink. In this period we see a dramatic decrease in attacks in the Arabian Sea and a slight increase in attacks along China's northeast coast.
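The color key above can be kept in a single lookup table so that every map marker is colored consistently. This is only a sketch: the marker_color() helper is a hypothetical name, and treating any unrecognized category as "other" is an assumption.

```python
# Color key taken from the legend described in the text.
VESSEL_COLORS = {
    "tanker": "blue",
    "carrier": "orange",
    "container": "green",
    "cargo": "red",
    "other": "purple",
    "tug": "beige",
    "supply": "pink",
}


def marker_color(category):
    """Return the marker color for a vessel category.

    Assumption: categories missing from the key are drawn as "other".
    """
    return VESSEL_COLORS.get(category, VESSEL_COLORS["other"])


print(marker_color("tanker"))
print(marker_color("yacht"))
```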

In this graph we can see how volatile the number of pirate attacks in Asia is. We can observe what seem to be two large spikes in pirate activity on the continent, around 2000 and 2011; after each of these spikes, however, pirate activity seems to have dropped slightly. We can also observe a slight, roughly linear increase in pirate activity in Africa.

4) Analysis and Hypothesis Testing

The fourth step in the data life cycle is analyzing data through the use of machine learning and drawing conclusions from it. Here we will use numpy's polyfit() function to fit polynomial regression models to total pirate attacks over time and to pirate attacks by continent over time.
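As a rough illustration of how numpy's polyfit() fits such a model, the sketch below fits a degree-2 polynomial to made-up yearly counts. The degree, the sample data, and the choice to measure years relative to the first year (which keeps the fit numerically well conditioned) are all assumptions, not necessarily the settings used for the graphs in this report.

```python
import numpy as np

# Hypothetical yearly attack counts that happen to follow an exact parabola.
years = np.array([2000, 2001, 2002, 2003, 2004, 2005, 2006])
attacks = np.array([10, 15, 18, 19, 18, 15, 10])

# Measure time in years since the first observation.
base_year = years[0]
x = years - base_year

# Fit a degree-2 polynomial; coefficients come back highest power first.
degree = 2  # assumed degree, chosen for illustration
coeffs = np.polyfit(x, attacks, degree)

# Evaluate the fitted model for a future year.
predicted_2007 = np.polyval(coeffs, 2007 - base_year)

print(coeffs)
print(predicted_2007)
```

With real data the fit will not be exact, so it is worth plotting the fitted curve over the observed counts before trusting any extrapolation.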

From this graph we can observe a loose-fitting regression curve indicating that the number of pirate attacks will slowly plateau going forward. This model identifies the spikes in pirate attacks during 2000, 2003, and 2009-2011, and the sharp decrease in attacks from 2004-2008, as outliers.

From this graph we can see that the polyfit model predicts that South America will continue to have a low number of pirate attacks every year and that pirate attacks in Africa will continue to increase. However, the model also predicts that pirate attacks in Asia will decrease, even though Asia is the largest contributor to the total number of attacks.
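The same polyfit() approach can be applied to each continent separately by grouping the yearly counts first. The sketch below uses made-up counts for two continents and a degree-1 (linear) fit purely to keep the example easy to check by hand; the column names, data, and degree are assumptions rather than the settings behind the graph above.

```python
import numpy as np
import pandas as pd

# Hypothetical yearly attack counts per continent.
records = pd.DataFrame({
    "year": [2010, 2011, 2012, 2013] * 2,
    "continent": ["Africa"] * 4 + ["Asia"] * 4,
    "attacks": [10, 12, 14, 16, 40, 35, 30, 25],
})


def fit_by_continent(df, degree=1):
    """Fit one polynomial per continent, with years measured from the first year."""
    base_year = int(df["year"].min())
    models = {}
    for continent, group in df.groupby("continent"):
        x = group["year"].to_numpy() - base_year
        models[continent] = np.polyfit(x, group["attacks"].to_numpy(), degree)
    return base_year, models


base_year, models = fit_by_continent(records)

# Extrapolate each continent's model one year past the data.
forecast_2014 = {
    name: float(np.polyval(coeffs, 2014 - base_year))
    for name, coeffs in models.items()
}
print(forecast_2014)
```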

Insight

After gathering, manipulating, visualizing, and analyzing the data on pirate attacks over the past 30 years, we can conclude that, for the most part, pirate attacks will continue to occur at the rate they have in the recent past unless more preventive measures are put in place. We observed several spikes in pirate activity. Since Asia has had significantly fewer pirate attacks after 2010-2012, one potential solution would be to examine Asian policy decisions on piracy from that period in an attempt to identify effective policies for combating this issue.

References